Privacy-preserving Classification of Data Streams

نویسنده

  • Ching-Ming Chao
چکیده

Data mining is the information technology that extracts valuable knowledge from large amounts of data. Due to the emergence of data streams as a new type of data, data streams mining has recently become a very important and popular research issue. There have been many studies proposing efficient mining algorithms for data streams. On the other hand, data mining can cause a great threat to data privacy. Privacy-preserving data mining hence has also been studied. In this paper, we propose a method for privacy-preserving classification of data streams, called the PCDS method, which extends the process of data streams classification to achieve privacy preservation. The PCDS method is divided into two stages, which are data streams preprocessing and data streams mining, respectively. The stage of data streams preprocessing uses the data splitting and perturbation algorithm to perturb confidential data. Users can flexibly adjust the data attributes to be perturbed according to the security need. Therefore, threats and risks from releasing data can be effectively reduced. The stage of data streams mining uses the weighted average sliding window algorithm to mine perturbed data streams. When the classification error rate exceeds a predetermined threshold value, the classification model is reconstructed to maintain classification accuracy. Experimental results show that the PCDS method not only can preserve data privacy but also can mine data streams accurately.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Privacy Preserving Data Stream Classification Using Data Perturbation Techniques

Data stream can be conceived as a continuous and changing sequence of data that continuously arrive at a system to store or process. Examples of data streams include computer network traffic, phone conversations, web searches and sensor data etc. These data sets need to be analyzed for identifying trends and patterns, which help us in isolating anomalies and predicting future behavior. However,...

متن کامل

Privacy-Preserving Data Stream Classification

In a wide range of applications, multiple data streams need to be examined together in order to discover trends or patterns existing across several data streams. One common practice is to redirect all data streams into a central place for joint analysis. This “centralized” practice is challenged by the fact that data streams often are private in that they come from different owners. In this pap...

متن کامل

Privacy-Preserving Classification for Data Streams

In a wide range of applications, multiple data streams need to be examined together in order to discover trends or patterns existing across several data streams. One common practice is to redirect all data streams into a central place for joint analysis. This “centralized” practice is challenged by the fact that data streams often are private in that they come from different owners. In this pap...

متن کامل

Privacy-Preserving Distributed Stream Monitoring

Applications such as sensor network monitoring, distributed intrusion detection, and real-time analysis of financial data necessitate the processing of distributed data streams on the fly. While efficient data processing algorithms enable such applications, they require access to large amounts of often personal information, and could consequently create privacy risks. Previous works have studie...

متن کامل

Privacy-Preserving Distributed Stream Monitoring (NDSS 2014)

Applications such as sensor network monitoring, distributed intrusion detection, and real-time analysis of financial data necessitate the processing of distributed data streams on the fly. While efficient data processing algorithms enable such applications, they require access to large amounts of often personal information, and could consequently create privacy risks. Previous works have studie...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008